Diffusion-based DNA target colocalization by thermodynamic mechanisms 
In eukaryotic cell nuclei, a variety of DNA interactions with nuclear elements occur, which, in
combination with intra- and inter- chromosomal cross-talks, shape a functional 3D architecture. In 
some cases they are organised by active, i.e. actin/myosin, motors. More often, however, they have 
been related to passive diffusion mechanisms. Yet, the crucial questions on how DNA loci recognise 
their target and are reliably shuttled to their destination by Brownian diffusion are still open. 
Here, we complement the current experimental scenario by considering a physics model, in which 
the interaction between distant loci is mediated by diffusing bridging molecules. We show that, in 
such a system, the mechanism underlying target recognition and colocalization is a thermodynamic 
switch-like process (a phase transition) that only occurs if the concentration and affinity of binding
molecules are above a threshold; otherwise, stable contacts are not possible. We also briefly discuss
the kinetics of this "passive-shuttling" process, as produced by random diffusion of DNA loci 
and their binders, and derive predictions based on the effects of genomic modifications and deletions. 



Introduction 

In the nucleus of eukaryotic cells, the spatial organi- 
zation of chromosomes has a functional role in genome 
regulation [THE]: DNA loci, for a correct activity, must 
occupy specific, but dynamically changing, positions with 
respect to other DNA sequences or nuclear elements. A 
diverse number of interactions exist but the mechanisms 
whereby distant loci recognize each other and come to- 
gether in complex space-time patterns are still largely 
unknown. Some loci undergo directed motion via active, i.e.
actin/myosin-dependent, processes [21 l5l I9HI3]. However, most
examples of cross-talk appear to be independent of active motors.
Therefore, passive diffusion has been proposed as a major,
energetically inexpensive mechanism [TJ [2] [7]. Brownian
mobility induces stochastic collisions of loci, which, in
turn, establish functional associations, e.g. via bridging
molecules. Such a scenario, however, raises fundamental
questions [5] HH [H]: How are these random encounters
coordinated in space and time? Are they probable? Are
they reliable for functional purposes? How are they
regulated?

Complex regulatory inter-chromosomal contacts occur,
for instance, in the β-globin, TH2 and Hox clusters [16] ITT].
Another striking example is observed during X chromo- 
some inactivation (XCI) in female mammalian cells. At 
the onset of XCI, the X inactivation centre (Xic) regions 
on the two Xs come in close apposition to regulate ex- 
pression of the Xist gene [HI [19]. The Xic interaction 
is mediated by the Tsix/Xite (and Xpr) [20] locus and 
relies on an RNA-protein bridge including CTCF, a zinc- 
finger protein having a cluster of a few dozen binding sites 
at the locus [18]. Once the different fates of the active
and inactive X chromosome have been determined, they 
are then targeted to different nuclear positions: the active 
X to the nuclear envelope and the inactive one, by Xist, 
to the nucleolus for maintenance of its silenced state [21].
Many other cases are known. The loop architecture of the 
major histocompatibility complex class I (MHC-I) locus 
on human chromosome 6 [22] is mediated, for instance,
by a set of specific molecules. Here, chromatin loops 
are organised by SATB1 and PML proteins, and PML- 
associated nuclear bodies, which tether clustered DNA 
binding sites to the nuclear matrix. The number and po- 
sition of these anchoring regions depend on the relative 
abundance of SATB1 and PML protein [22] . For exam- 
ple, whereas Jurkat T cells show five chromatin loops 
within such a region, CHO cells, having a lower expres- 
sion of SATB1, have six loops that also differ in position 
[22] [25]. However, if the SATB1 concentration in CHO
cells is matched with that of Jurkat T cells, a new loop
organisation mimicking that of Jurkat T cells is found [22].
Looping of specific remote loci is fundamental for the 
regulation of the Kit gene in erythropoiesis (the produc- 
tion of red blood cells) [23]. In immature erythroid cells, 
where Kit is active, a distal 5' enhancer is shuttled to the 
Kit gene promoter and bound by GATA2 proteins. Upon 
cell maturation, Kit is repressed and the above conforma- 
tion changed: GATA2 is displaced while GATA1 proteins 
and cofactors bring a downstream region to the promoter 
[24]. In this case, the relative expression level of GATA
proteins acts on the chromatin conformation and controls 
the switch of Kit [23]. Interestingly, clusters of binding
sites are typically involved in most of the above examples
CO 02 El El [15]. As mentioned earlier, the open question
concerns the underlying organisational principles of such
complex systems: how can Brownian random processes
be finely regulated? How can such a variety of molecular
elements be orchestrated? How do they recognise
each other from a distance and get brought into apposition?
Here, we investigate a schematic physics model
describing the interactions of a DNA locus (modelled as 
a polymer) and a nuclear target (e.g. nucleolus) medi- 
ated by a set of binding Brownian molecules. We show 
that target recognition and colocalization occurs via a 
switch-like thermodynamic mechanism - a phase transi- 
tion - marked by specific thresholds in molecular binders 
concentration and affinity. Below these thresholds, dif- 
fusion is unable to produce colocalization; above these 
thresholds, despite the diffusive nature of motion, colo- 
calization proceeds spontaneously at no energetic cost, 
with resources being provided by the thermal bath. Im- 
portantly, we show that binding energies and concentra- 
tions where the transition happens fall in the relevant 
biological range, whereas the ON-OFF character of the 
transition ensures the full reliability of the process. For
this reason, this could be seen as a "passive-shuttling
process", where the adjective "passive" distinguishes it
from the form of shuttling produced by active motors
(e.g. actin/myosin systems). Thus, our picture can
explain how well-described cell strategies of upregulation of
DNA binding proteins or chromatin chemical modifica- 
tions can produce efficient and sharply regulated genomic 
architectural changes. The scenario we depict also has a 
close analogy with the known problem of polymer ad- 
sorption at a surface (see and references therein) . 
We describe the theoretical bases of the mechanism by a 
mean-field analytical approach, which we confirm by ex- 
tensive Monte Carlo computer simulations. Finally, we 
briefly discuss the system kinetics. 

MATERIALS AND METHODS 
The model 

We study two schematic models representing the situ- 
ation where a DNA locus is shuttled towards a different 
nuclear target (e.g. nucleolus, nuclear membrane, ma- 
trix) or to another DNA sequence. In the first case, the 
DNA sequence is represented, via a standard polymer 
physics model, as a floating random walk polymer of n 
beads (Fig. 2, upper panels). The polymer interacts
with a concentration, c, of Brownian molecular factors
(MFs), which can bind it at a number, $n_0$, of clustered
binding sites (BSs) with chemical affinity E. In real
examples, the number and location of binding sites depend
on the specific locus considered. For definiteness, here
we refer to the well-studied Tsix/Xite locus of X colo- 
calization and choose the number and chemical affinity of 
binding sites accordingly (see below) . However, as known 
in polymer physics, our results are robust to parameter 
changes (see [55] and below). In our model, a nuclear 
target is also included. It is schematically described as 
an impenetrable surface having a linearly arranged set of
binding sites for the DNA binding molecules (Fig. 2, upper
panels). For the sake of illustration, we assume that
their number is also $n_0$ and their affinity E. We use a
simple lattice version of the random walk polymer model. 
This is well established in polymer physics and has the 
advantage to be simple enough to permit comparatively 
faster simulations with respect to off-lattice models. In 
this way, we can add further degrees of freedom into our 
system, which represent the binding molecules, without 
making computation unfeasible. In fact, molecules are
dealt with as a statistical mechanics "lattice gas"
interacting with the polymer chain [30]. We consider a cubic
lattice of linear sizes $L_x = 2L$, $L_y = L$ and $L_z = L$ (in
units of $d_0$, the characteristic size of a bead on the
polymer; see below), with periodic boundary conditions to
reduce boundary effects [31]. For the sake of simplicity,
the DNA sequence is treated as a directed polymer [25],
i.e., its tips are bound to move on the top and bottom
surfaces of the system volume (Fig. 2). It comprises n = L
beads, which move randomly under a "non-breaking"
constraint: two consecutive beads can sit only on nearest or
next-nearest neighbouring lattice sites. A bond between
an MF and a BS can form when they are on nearest
neighbouring sites; MFs can form multiple bonds (as
CTCF proteins do). The use of directed polymers to
represent DNA segments allows faster simulations with- 
out affecting the general properties of the colocalization 
mechanism we describe because they are produced by a 
general free-energy minimization mechanism, which does 
not depend on such details (see Results and Discussion 
sections). In the case of a non-directed polymer model,
DNA would bind its target as well, but without the perfect
alignment found in our model [H] (Fig. 2, upper panels).
A strategy to attain a straight alignment anyway would
be to consider a gradient of BSs along the polymer and
its target. In real cells, the number and distribution of
binding sites depend on the specific locus considered but,
as shown in polymer physics [29, 30], our thermodynamic
picture is robust. To investigate the colocalization of two
DNA sequences, we also consider a variant of the model
where the nuclear scaffold is removed and a replica of
the polymer is added. We explore these models
by a statistical mechanics mean-field treatment and by
Monte Carlo (MC) computer simulations. We use
the available biological data to set the range of model
parameters. Our models include only minimal ingredients
and are very schematic, but they permit us to derive
a precise, quantitative picture of passive shuttling.
Moreover, our scenario relies on a robust thermodynamic
mechanism and its general aspects are thus not affected
by the simplicity of the models.



DNA binding site number and chemical affinity 

Details on binding energies and DNA locations of binding
sites are known in some examples (see 35 Ill] and
references therein), but in most cases only qualitative
information is currently available. For instance, in vitro
measurements exist [HI H2] of the dissociation constants of CTCF
proteins from DNA binding sites, which give binding
energies around E ~ 20 kT, k being the Boltzmann
constant and T room temperature (for example, see [43]
on how to derive the binding energy from the dissociation
constant). The precise value of in vivo binding energies
depends on the specific DNA site considered and can be
very hard to measure, yet these in vitro measurements
provide the typical energy range. It is experimentally well
documented that DNA binding proteins, like those
mentioned in the Introduction, have a number of target loci
with chemical affinities in the weak biochemical energy
range, E ~ 0-20 kT [35-40]. This is the energy scale
we consider here. The BS number $n_0$ on the DNA,
as well as on its target, is chosen to be $n_0 = 24$ (i.e., the
order of magnitude of the known number of CTCF sites
in the Tsix/Xite region on the X chromosome [UJ), but
it is varied to describe the effects of BS deletions (see
Fig. 5, inset).
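
As a rough illustration of the conversion mentioned above, the following sketch estimates a binding energy in units of kT from a dissociation constant. It assumes the standard relation E = -kT ln(K_d / c°) with a reference concentration c° of 1 mol/l; the function name and the example K_d values are hypothetical and are not taken from the cited measurements.

```python
import math

# ASSUMPTION: standard-state reference concentration of 1 mol/l.
STANDARD_CONCENTRATION_M = 1.0


def binding_energy_in_kT(kd_molar):
    """Return the binding energy E/kT implied by a dissociation constant (mol/l)."""
    if kd_molar <= 0:
        raise ValueError("K_d must be positive")
    return -math.log(kd_molar / STANDARD_CONCENTRATION_M)


if __name__ == "__main__":
    # Hypothetical dissociation constants spanning millimolar to nanomolar.
    for kd in (1e-3, 1e-6, 1e-9):
        print(f"K_d = {kd:.0e} M  ->  E = {binding_energy_in_kT(kd):.1f} kT")
```

Under this convention a nanomolar dissociation constant corresponds to roughly 20 kT, i.e. the upper end of the energy range considered here, while weaker (micromolar to millimolar) binding falls well inside it.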

Molecule concentration 

The order of magnitude of the concentration of molecular
factors, c, can be roughly estimated and compared
with the concentrations of proteins in real nuclei. In our
model, the number of molecules per unit volume is $c/d_0^3$,
where $d_0$ is the linear lattice spacing constant, which
implies that the molar concentration is $\rho = c/(d_0^3 N_A)$,
where $N_A$ is the Avogadro number. Under the assumption
that a polymer bead represents a DNA segment of
~20 bp (i.e. of the order of magnitude of a CTCF binding
site in the Tsix/Xite region) [El|44], we obtain the order
of magnitude of the polymer bead size, $d_0 \sim 10$ nm. By
using such a value of $d_0$, typical concentrations of
regulatory proteins such as $\rho \sim 10^{-3}$ to $10^{-1}$ μmol/l (i.e.,
$\sim 10^3$ to $10^5$ molecules per nucleus) would correspond to
volume concentrations in our model of $c \sim 10^{-4}$ to $10^{-2}$
percent. Such an estimate is approximate, but could
guide the connection of our study to real biological
situations.
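
The unit conversion in this estimate can be checked with a few lines of arithmetic. The sketch below is only illustrative: it takes the relation ρ = c/(d_0^3 N_A) and the bead size d_0 ~ 10 nm from the text, while the function name and its default argument are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def volume_fraction_percent(rho_micromolar, d0_nm=10.0):
    """Lattice concentration c (percent of occupied sites) for a molar concentration.

    Inverts rho = c / (d0^3 * N_A); d0_nm is the assumed bead size in nm.
    """
    rho_mol_per_m3 = rho_micromolar * 1e-6 * 1e3      # umol/l -> mol/m^3
    site_volume_m3 = (d0_nm * 1e-9) ** 3              # volume of one lattice cell
    molecules_per_site = rho_mol_per_m3 * AVOGADRO * site_volume_m3
    return 100.0 * molecules_per_site


if __name__ == "__main__":
    for rho in (1e-3, 1e-2, 1e-1):
        print(f"rho = {rho:g} umol/l  ->  c = {volume_fraction_percent(rho):.1e} %")
```

With d_0 = 10 nm, the two ends of the molar range map to a few times $10^{-5}$ and a few times $10^{-3}$ percent, i.e. the orders of magnitude $10^{-4}$ to $10^{-2}$ percent quoted above.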



Monte Carlo simulations 

In our Monte Carlo (MC) simulations, we run up to $10^9$
MC steps per simulation and our averages are over up to
500 runs. At each MC step, the algorithm tries to move,
on average, all the particles of the system (molecules and
polymer beads, in random order) according to a transition
probability proportional to $e^{-\Delta H/kT}$ [31], where
$\Delta H$ is the energy barrier of the move. Therefore, the
binding/dissociation rate is given by the Arrhenius factor
$r_0 e^{-\Delta H/kT}$, where $r_0$ is the bare reaction rate. The
MC time unit (a single lattice sweep) thus corresponds
to a time $\tau_0 = r_0^{-1}$ [31]. In turn, $\tau_0$ is related to the
polymer diffusion constant D and to the lattice spacing
constant $d_0$: $D = \langle \Delta s^2 \rangle d_0^2 / (4\tau_0)$, where
$\langle \Delta s^2 \rangle$ is the mean square displacement (expressed
in units of $d_0^2$) of the polymer center-of-mass per unit MC time.
We measure $\langle \Delta s^2 \rangle$, and the value of $d_0$ can be
estimated to be of the order of magnitude of a typical protein
binding site, ~10 nm (see above). We impose that the diffusion
coefficient D of a free polymer (i.e. with E = 0) in our
lattice is of the order of magnitude of the measured
diffusion constant of human DNA loci (D = 1 μm²/hour)
[45]. As a result, an MC lattice sweep is found to
correspond to $\tau_0 \sim 30$ ms (falling well within the range of
known biological kinetic constants [35]). The above MC
simulations produce an artificial dynamics and, in general,
serious caution must be taken in interpreting it as the real
kinetics. However, in the current prevailing interpretation
[31], in a system dominated by Brownian motion, an
MC Metropolis dynamics is supposed to describe well the
general long-term evolution of the system. On that
basis, we assume here that MC simulations can provide
some insight into the system kinetics. We consider
a lattice with L = 32, i.e. with dimensions $L_x = 2L = 64$
and $L_y = L_z = L = 32$ in units of $d_0$. DNA segments
have n = 32 beads. We also performed simulations with
different values of L and n (up to L = 128 and n = 128)
and checked that our general results remained essentially
unchanged. The conceptual support for using comparatively
small self-avoiding walk (SAW) polymer chain sizes
to extrapolate the behaviour of longer chains is grounded
in statistical mechanics and relies on the system scaling
properties [31]. For instance, the transition energy $E^*$
has a comparatively simple behaviour with variations in
n [15] and rapidly converges at large n to a finite value
comparable with $E^*(n = 32)$. These remarks support
the idea that our results are not an artefact of the
specific length of the polymer.
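
The two ingredients of the time calibration described in this section can be sketched as follows. This is a minimal illustration, not the simulation code used here: the acceptance rule is the standard Metropolis form min(1, exp(-ΔH/kT)), which is assumed to be what "transition probability proportional to exp(-ΔH/kT)" refers to, and the value of the mean square displacement per sweep passed in the example is hypothetical (the measured value is not quoted in the text).

```python
import math
import random


def metropolis_accept(delta_h_over_kt, rng=random):
    """Accept a trial move with probability min(1, exp(-dH/kT)) (standard Metropolis rule)."""
    if delta_h_over_kt <= 0.0:
        return True
    return rng.random() < math.exp(-delta_h_over_kt)


def sweep_time_seconds(msd_per_sweep, d0_nm=10.0, diffusion_um2_per_hour=1.0):
    """Physical duration tau_0 of one lattice sweep from D = <ds^2> d0^2 / (4 tau_0).

    msd_per_sweep is the centre-of-mass mean square displacement per sweep,
    in units of d0^2, as measured for a free polymer (E = 0).
    """
    d0_m = d0_nm * 1e-9
    d_m2_per_s = diffusion_um2_per_hour * 1e-12 / 3600.0
    return msd_per_sweep * d0_m ** 2 / (4.0 * d_m2_per_s)


if __name__ == "__main__":
    # Hypothetical measured displacement of one third of a squared lattice spacing per sweep.
    print(f"tau_0 = {sweep_time_seconds(1.0 / 3.0) * 1e3:.0f} ms")
```

With d_0 = 10 nm and D = 1 μm²/hour, a displacement of about one third of a squared lattice spacing per sweep would give a sweep time of about 30 ms, the order of magnitude stated above.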

Results 
Mean-Field theory 

To describe the concept behind passive shuttling and
colocalization, we briefly discuss the statistical mechanics
of the system at the level of a mean-field, coarse-grained
approximation [30]. We refer to the polymer
adsorption literature for more advanced theoretical
approaches (f 25-f2"5] and references therein). For the sake
of definiteness, we consider the case with two DNA polymers.
We partition the nucleus into two halves and
name x the probability to find polymer 1 in the right
half and y the probability to find polymer 2 in the left
half. In a Ginzburg-Landau approach [30] HZ], the system
free-energy density can be written as a function of
x and y:

$$F(x,y) = H(x,y) - T\,S(x,y).$$

The interaction energy density, H, can be expanded in powers
of x and y to consider the first nontrivial terms:

$$H = -E_b\,[x(1-y) + y(1-x)].$$

The above quadratic form arises because a molecular bridge
between the polymers can be formed only if they are in the
same part of the nucleus. $E_b$ is the average binding energy
density, which at low c and E is approximately the product of the
density of available binding sites bound by a molecule,
$c\,n_0$, multiplied by the total chemical affinity of a bridge,
2E: $E_b(c,E,n_0) \propto 2 E c n_0$. In turn, the entropy, S(x,y),
in such a mean-field approach can be approximated as
the sum of the entropies of the two non-interacting polymers,
$S(x,y) = S(x) + S(y)$, where S(x) has the standard
expression [50]

$$S(x) = -k\,[x \ln x + (1-x)\ln(1-x)].$$

The equilibrium state of the system is obtained (in the
thermodynamic limit) as the minimum of F(x,y). The
corresponding equations, $\partial_x F = \partial_y F = 0$, always have a
trivial solution (x,y) = (1/2, 1/2), representing the state
where the polymers have independent and equal probabilities
to be on the left or right side of the nucleus.
However, if the bridging energy $E_b$ is larger than a critical
value, $E_b^* = 2kT$, the above solution turns into a
saddle point and two new non-trivial minima arise where
$x = 1 - y \neq 1/2$ (Fig. 1A, inset). A second-order phase
transition [30] occurs at $E_b^*$, with a consequent spontaneous
symmetry breaking: the two minima, i.e. the
thermodynamically favoured states, correspond to the
colocalization of the polymers on the same side of the nucleus.
The system order parameter is the polymer excess
colocalization probability, p, i.e. the probability to find them
in the same region minus the probability to be in different
regions:

$$p = x(1-y) + y(1-x) - [xy + (1-x)(1-y)].$$

If $E_b < E_b^* = 2kT$, polymers are independently located
in the nucleus and p = 0; above the critical point, they
are more likely to be found together in the same area
and p > 0 (Fig. 1A). The critical energy value, $E_b^*$,
corresponds to the point where the entropy loss owing to
colocalization is compensated by the corresponding energy
gain. Close to the transition, p has a power law
behaviour, $p \sim (E_b - E_b^*)^{\beta}$, with a mean-field exponent
$\beta = 1$ [30]. The phase where p > 0 is the "colocalization
phase", whereas for p = 0, polymers move
independently (the "Brownian phase"). The critical value
$E_b^* = 2kT$ can be written in terms of the model parameters
(c, E, $n_0$), providing the following expression for the
transition surface (Fig. 1B): $c\,n_0\,E/kT = \mathrm{constant}$. The
advantage of the above mean-field description is to
illustrate the basic ideas of the scenario we propose. However,
it is very schematic and in the following sections we
discuss a detailed MC simulation of the model.
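
The mean-field transition can be reproduced numerically. The sketch below is an illustration under the free energy written in this section, F = -E_b[x(1-y) + y(1-x)] - T[S(x) + S(y)]: setting the two partial derivatives to zero gives the self-consistency conditions x = 1/(1 + exp(-E_b(1-2y)/kT)) and y = 1/(1 + exp(-E_b(1-2x)/kT)), which are iterated from a slightly asymmetric starting point. The function names, the starting point and the number of iterations are hypothetical choices.

```python
import math


def sigmoid(u):
    """Logistic function 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def mean_field_order_parameter(eb_over_kt, seed=0.51, iterations=20000):
    """Excess colocalization probability p at the mean-field stationary point.

    Iterates the stationarity conditions dF/dx = dF/dy = 0 starting from
    (x, y) = (seed, 1 - seed), i.e. a small bias towards one half of the nucleus.
    """
    x, y = seed, 1.0 - seed
    for _ in range(iterations):
        # Simultaneous update: both right-hand sides use the previous (x, y).
        x, y = (sigmoid(eb_over_kt * (1.0 - 2.0 * y)),
                sigmoid(eb_over_kt * (1.0 - 2.0 * x)))
    same_side = x * (1.0 - y) + y * (1.0 - x)
    opposite_sides = x * y + (1.0 - x) * (1.0 - y)
    return same_side - opposite_sides


if __name__ == "__main__":
    for eb in (1.0, 1.5, 2.5, 3.0, 4.0):
        print(f"E_b = {eb:.1f} kT  ->  p = {mean_field_order_parameter(eb):.3f}")
```

Below $E_b = 2kT$ the iteration relaxes back to (1/2, 1/2) and p vanishes; above it, the small initial bias grows into one of the two colocalized minima and p becomes finite. Convergence is slow very close to the critical value itself, as expected near a second-order transition.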



DNA-target colocalization 

The colocalization mechanisms - We first consider
the model describing the system made of the binding
molecular factors, a polymer and a plane representing
the surface of a nuclear target (Fig. 2, upper panels).
Because the diffusing molecules can bind both the polymer
and the plane, they can induce an effective attraction
force between them via the formation of bridges. We
illustrate such an effect by considering the mean square
distance $d^2(t)$ between the binding sites (BSs) of the
polymer sequence and the nuclear target BSs, as a
function of time, t:

$$d^2(t) = \frac{\langle r^2(z,t) \rangle}{r^2_{rand}},$$

where $r^2(z,t)$ is the square distance between two BSs at
height z at time t, averaged over all the BSs and over
different MC simulations (indicated by $\langle \ldots \rangle$). We use as
a normalization constant the mean square distance from the
target expected for a randomly diffusing polymer, $r^2_{rand}$.
The system evolves according to the master equation
simulated by the MC evolution. The DNA polymer is
initially positioned at a distance L from the target plane in
a straight, vertical configuration (therefore, the starting
value of the distance is $d^2(t = 0) = L^2/r^2_{rand} \sim 2.5$) and
a given concentration c of MFs is randomly positioned in
the volume. Fig. 2 shows the dynamics of $d^2(t)$ for two
values of the interaction energy E (here, c = 0.2% and
$n_0 = 24$). When the binding energy E is small enough,
say E = 1.6 kT, the long-time value of $d^2(t)$ is equal to
100%, as expected for a randomly diffusing polymer. In
principle, a bridge can be stochastically formed by an MF
but no stable interactions are established, although a
finite concentration of MFs is present. A drastic change
in behaviour is observed, however, when the energy is
raised to E = 2.5 kT (Fig. 2): now the long-time plateau
of $d^2$ collapses to zero, signalling that full colocalization
has occurred. An effective attraction force is generated
and the polymer spontaneously finds and stably binds
its target (Fig. 2, upper panels). The dynamics is
characterised at short times by a Brownian diffusion regime
where $d^2(t)$ is linear in t (Fig. 2, inset) and the polymer
randomly explores the surrounding space. During that time,
it enters into contact with its target. Afterwards, an
exponential decay of $d^2(t)$ to the equilibrium value is
observed and, for a large-enough E, the interaction is
stabilised. A fit function for $d^2(t)$, which incorporates the
initial linear and the later exponential regime, is:
where $d^2(\infty)$ is the thermodynamic equilibrium value
of $d^2$ and a, b and $\tau$ are fit parameters (depending on
the value of c, $n_0$, E). In particular, $\tau$ represents the
average time needed to reach the asymptotic state, which
increases as a function of c and E as a consequence of the
decrease in the DNA diffusion constant, as described
in the next section. Time scales can be affected by
other complexities (such as chromatin entanglements,
crowding, etc.) that we do not consider at the level of
our schematic description here. However, it is interesting
to note that, although only reasonable guess values are
used for the system parameters (i.e. molecular
concentration and/or affinity, number of binding sites, etc.;
see model description), the values of $\tau$ predicted by MC
simulations are compatible with the characteristic time
scales of cellular processes ($\tau \sim 10$ to $10^2$ minutes; Fig.

Target search by diffusion - The dynamics of
colocalization is interesting in itself and still largely
unexplored experimentally. As it is diffusive in nature, the mean
square displacement from the initial position, $\langle \Delta s^2 \rangle$, of
the polymer center-of-mass is a main quantity describing
the kinetics. Fig. 3 shows $\langle \Delta s^2 \rangle(t)$ for the same
two values of the interaction energy, E, considered in the
previous section. For E/kT = 1.6 (i.e., when no stable
colocalization is observed), $\langle \Delta s^2 \rangle(t)$ has the typical
Brownian linear behaviour with t, at short as well as at
long times (Fig. 3); overall, the polymer motion is
unaffected by the presence of binding molecules and nuclear
scaffold. For E/kT = 2.5, Brownian diffusion is instead
only found at short t while the polymer is searching for its
target (Fig. 3, inset); at longer times, $\langle \Delta s^2 \rangle(t)$ reaches
a constant plateau, which signals that the polymer has
become firmly bound to the scaffold BSs and cannot diffuse
anymore (Fig. 3).

The inset in Fig. 3 shows both the short- and long-time
diffusion constants $D = \langle \Delta s^2 \rangle(t)/4t$ (named,
respectively, $D_0$ and $D_\infty$) as a function of the energy E.
$D_0(E)$ decreases smoothly with E because
the larger the binding energy, the higher the number
of MFs bound to the polymer. The long-time diffusion
constant, $D_\infty(E)$, has a different behaviour. When E is
small, $D_\infty(E)$ is very close to $D_0(E)$, showing that the
polymer motion is diffusive at all times. However, above
a transition point, $E_{tr} \approx 2.1\,kT$, $D_\infty(E)$ collapses to zero
as a result of the attachment of the DNA segment to the
fixed scaffold, which stops further diffusion (Fig. 3,
inset).
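
The definition $D = \langle \Delta s^2 \rangle(t)/4t$ can be applied directly to a recorded displacement curve. The following sketch is illustrative only: the function name, the averaging window and the synthetic curves in the example are hypothetical and do not reproduce the simulation data.

```python
def diffusion_constants(times, msd, window=0.05):
    """Short- and long-time diffusion constants from D = <ds^2>(t) / (4 t).

    The ratio is averaged over the first and the last `window` fraction
    of the recorded points (points with t = 0 are skipped).
    """
    ratios = [m / (4.0 * t) for t, m in zip(times, msd) if t > 0]
    if not ratios:
        raise ValueError("need at least one point with t > 0")
    k = max(1, int(window * len(ratios)))
    d_short = sum(ratios[:k]) / k
    d_long = sum(ratios[-k:]) / k
    return d_short, d_long


if __name__ == "__main__":
    times = list(range(1, 1001))
    free = [4.0 * 0.25 * t for t in times]              # purely Brownian curve
    bound = [min(4.0 * 0.25 * t, 50.0) for t in times]  # curve saturating at a plateau
    print("free :", diffusion_constants(times, free))
    print("bound:", diffusion_constants(times, bound))
```

For a purely Brownian curve the two estimates coincide, whereas a curve that saturates at a plateau gives a long-time value that keeps decreasing with the observation time, which is the signature of binding described above.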

The switch for colocalization - Alongside the
squared distance, $d^2$, we consider the probability, p, that
the polymer is bound to its nuclear target, i.e. that its mean
distance from the scaffold BSs is less than 10% of the
lattice linear dimension L. The equilibrium values of
$d^2$ and p have a similar threshold behaviour as functions
of E, as illustrated in Fig. 4. At small values
of E, we find $d^2 \sim 100\%$ and $p \sim 0$; conversely, for
E larger than a threshold $E^* \sim 2.1\,kT$, $d^2 \sim 0$ and
$p \sim 100\%$, showing that stable binding to the target
has occurred. The critical value $E^*$ is here defined by
the criterion $p(E^*) = 50\%$ and is numerically consistent
with the threshold, $E_{tr}$, found for $D_\infty(E)$. At
approximately $E^*$, $d^2$ and p have values intermediate
between those of the Brownian phase ($d^2 \sim 100\%$ and
$p \sim 0\%$) and those of the colocalization phase ($d^2 \sim 0\%$
and $p \sim 100\%$); in this crossover regime (typical of phase
transitions in finite systems [30]), only partially stable
bridges are built between polymer and nuclear target.
Whereas the sharpness of the crossover region is known
to increase in the thermodynamic limit [31], we found
that it does not depend on the specific value of the
system parameters (i.e. molecule concentration and affinity,
and number of DNA target binding sites). As predicted
by mean-field theory, colocalization can also be triggered
by an above-threshold concentration of MFs, c, and BS
number $n_0$. This is illustrated by the phase diagram in
(c, E, $n_0$) space (Fig. 5, inset). The transition surface
between the two phases has been obtained by a power-law
fit of numerical data:

$$c\,(E - \bar{E})^{\alpha}\,(n_0 - \bar{n}_0)^{\beta} = \mathrm{constant},$$

with $\alpha = 4.5$, $\beta = 1.2$, $\bar{E} = kT$ and $\bar{n}_0 = 8$. From these
data, we derive that, for typical concentrations of
regulating proteins (i.e. $c \sim 10^{-4}$ to $10^{-2}$ percent, see above),
the transition energies fall in the range E ~ 3-7 kT,
which is well within the interval of typical DNA-protein
affinities found in the literature (E ~ 0-20 kT). Note
that in a real population of cells, the fraction of colocalized
sequences is expected to be smaller than in our in
silico model for several reasons, such as the lack of full
synchronization (DNA colocalization can be induced or
released at different times in different cells, whereas our
system is perfectly synchronized).
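
The power-law form of the transition surface, c (E - Ē)^α (n_0 - n̄_0)^β = constant with α = 4.5, β = 1.2, Ē = kT and n̄_0 = 8, can be used to extrapolate thresholds. The value of the constant is not quoted, so the sketch below fixes it, purely as an assumption, by requiring the surface to pass through the point c = 0.2%, n_0 = 24, E* = 2.1 kT taken from the simulations described above; the function names are hypothetical.

```python
ALPHA, BETA = 4.5, 1.2     # fitted exponents quoted in the text
E_BAR, N_BAR = 1.0, 8.0    # offsets: E_bar = 1 kT, n_bar_0 = 8


def surface_constant(c_percent, e_kt, n0):
    """Left-hand side of the fitted transition surface."""
    return c_percent * (e_kt - E_BAR) ** ALPHA * (n0 - N_BAR) ** BETA


# ASSUMPTION: the surface passes through (c = 0.2 %, n0 = 24, E* = 2.1 kT).
K = surface_constant(0.2, 2.1, 24)


def transition_energy(c_percent, n0=24, constant=K):
    """Threshold energy E*/kT at a given concentration and BS number."""
    return E_BAR + (constant / (c_percent * (n0 - N_BAR) ** BETA)) ** (1.0 / ALPHA)


def critical_site_number(c_percent, e_kt, constant=K):
    """Threshold BS number at a given concentration and binding energy."""
    return N_BAR + (constant / (c_percent * (e_kt - E_BAR) ** ALPHA)) ** (1.0 / BETA)


if __name__ == "__main__":
    for c in (1e-2, 1e-4):
        print(f"c = {c:g} %  ->  E* = {transition_energy(c):.2f} kT")
    print(f"n0 threshold at c = 0.2 %, E = 2.5 kT: {critical_site_number(0.2, 2.5):.1f}")
```

With this calibration, concentrations of $10^{-2}$ and $10^{-4}$ percent give thresholds of about 3.1 kT and 7.0 kT respectively, in line with the 3-7 kT range quoted above. The same expression can be inverted for the BS number; for the illustrative choice E = 2.5 kT at c = 0.2% it gives a threshold of about 13 sites out of 24.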

Non-linear effects of deletions - In our model,
a variation in $n_0$ can describe the deletion of a fraction
of DNA binding sites, and we consider now the case
where the BSs on the polymer are reduced by an amount,
$\Delta n_0$, with respect to the wild-type number $n_0$. Deletions
have a non-linear effect, characterized by a threshold
behaviour. The equilibrium value of p has a sigmoid
shape as a function of $\Delta n_0/n_0$, with a threshold of ~50%
(Fig. 6): short deletions (e.g. with $\Delta n_0/n_0 < 50\%$) do not
result in a relevant reduction of p, whereas colocalization is lost
as soon as $\Delta n_0/n_0$ gets larger than such a threshold.
This ON/OFF behaviour stems from the nontrivial
thermodynamic origin of the MF-mediated effective attraction
between polymer and nuclear target. The threshold
value of $n_0$ is a decreasing function of E and c, as seen
in the phase diagram in the Fig. 5 inset. These results
predict a nontrivial effect of deletions that
could be tested experimentally. Similar results were also
found for transgenic insertions and related ectopic
association (data not shown). In summary, we showed that
binding MFs induce an effective attraction between the
BSs on the DNA polymer and the other nuclear element,
whereby the DNA segment is brought into close apposition
with the target. The attraction, however, is only present
if the MF concentration, the BS number and the MF-BS
interaction energy are above a threshold value; otherwise
the DNA segment diffuses randomly in the lattice (Fig.

Role of non-specific binding sites - It is interesting
to describe the effects on the colocalization
mechanism of the presence of a number of non-specific
binding sites on DNA and/or its target. The problem
of how sequence-specific proteins can find their DNA
sites on very large eukaryotic genomes is long-standing (for
example, see [48-50]). It has been proposed that the
presence of non-specific binding sites allows a mixture of
one-dimensional diffusion of bridging molecules along the
DNA and three-dimensional diffusion in the surrounding
medium, which could result in a more efficient search
of the DNA target sites than a purely one- or
three-dimensional diffusion [35] [51-54]. Conversely, binding
of molecules to these sites is expected to impair shuttling
by reducing the effective concentration of diffusing
molecular mediators. We tested the effect of the presence
of non-specific sites in our schematic model: along
with the clusters of specific sites previously included on
the polymer and on its target, we inserted up to $4 \times 10^4$
non-specific (i.e., low affinity) binding sites distributed
on the target surface and within the polymer itself. We
performed Monte Carlo simulations to find the equilibrium
state of the system as a function of the molecular
concentration c and the specific binding energy E, with a
fixed affinity for non-specific sites equal to $E_{ns} = 1.5\,kT$.
Fig. 5 shows the changes in the phase diagram with
respect to the case $E_{ns} = 0$. Orange squares and green
circles mark the transition points between the Brownian
and the colocalization phase for $E_{ns} = 0$
(the case we dealt with previously) and $E_{ns} = 1.5\,kT$, respectively.
The plot reveals that the presence of non-specific binding
sites moves the transition line upwards. This is due to a
reduction in the effective concentration of molecules that
are available to the specific sites, which are responsible for
recognition and attachment to the target. This effect can be
important and affects the location of the transition line even
for comparatively small affinities (e.g. $E_{ns} = 1.5\,kT$;
Fig. 5). However, the overall colocalization mechanism
we discussed before is shown to be very robust.

Colocalization of DNA loci 

Similar mechanisms act to shuttle a DNA segment 
to another DNA segment [33J IS]- We illustrate this 
by considering now a model that includes two polymers 
[33], [34]. As before, two regimes are found: when E, c
and n_0 are below threshold, the polymers float independently;
above threshold, they colocalize. When the colocalization
machinery is switched on, the DNA segments will inevitably
find and bind each other. Fig. [7]B illustrates different
stages of the dynamics leading to colocalization: the polymer
centers-of-mass are highlighted in green, whereas the darker
green lines trace the trajectory they have covered up to that
moment. The pictures show how, once their initial Brownian
diffusion brings the DNA segments close enough (see t = 0.5
minutes), i.e. within the range of the effective attraction
induced by the MFs, they colocalize (t = 5 minutes) and begin
to diffuse together in the lattice (t = 50 minutes). The mean
square displacement ⟨Δs²⟩(t) of the centers-of-mass of
the DNA segments is plotted in Fig. [7]A. At low energy
(e.g., E/kT = 1.4), when the colocalization machinery is
off, Brownian motion has approximately the same diffusion
constant at short and long times. At higher energy
(e.g., E/kT = 1.9) two dynamical regimes are found: an
initial one when the two polymers diffuse independently
and a later, slower one when they move bound to
each other. Such a behaviour is captured by a plot of the
short- and long-time diffusion constants, D_0 and D_∞, as
a function of E (Fig. [7]A, inset). As shown earlier, D_0(E)
decreases with E, and D_∞(E) follows it. The transition
point, E_tr, is marked by a drastic reduction of D_∞(E),
whereas no major changes are found in the behaviour of
D_0(E). Above E_tr, D_∞(E) is non-zero as the two paired
DNA segments continue to diffuse, although with a diffusion
constant that is some orders of magnitude smaller
than in the free case (Fig. [7]B). Such a large reduction
is due to the much larger mass of the diffusing object in
the colocalized state, which is formed by the pair of
polymers and by a number of attached molecules.
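
As an illustration of how short- and long-time diffusion constants can
be read off a mean square displacement curve, the sketch below fits the
slope of ⟨Δs²⟩(t) in an early and in a late time window and uses the
standard relation ⟨Δs²⟩ = 2 d D t in d dimensions. It is a minimal
example under stated assumptions, not the analysis used for Fig. [7]:
the window boundaries, the synthetic two-regime curve and all function
names are hypothetical.

```python
def fit_slope(ts, ys):
    """Ordinary least-squares slope of ys against ts."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    den = sum((t - t_mean) ** 2 for t in ts)
    return num / den

def short_and_long_time_d(ts, msd, t_short, t_long, dim=3):
    """Estimate (D_0, D_inf) from msd(t) = 2 * dim * D * t + const.

    D_0 is fitted on points with t <= t_short,
    D_inf on points with t >= t_long.
    """
    early = [(t, y) for t, y in zip(ts, msd) if t <= t_short]
    late = [(t, y) for t, y in zip(ts, msd) if t >= t_long]
    d_0 = fit_slope(*zip(*early)) / (2 * dim)
    d_inf = fit_slope(*zip(*late)) / (2 * dim)
    return d_0, d_inf

# Synthetic curve (arbitrary units): free diffusion with D = 1 up to
# t_pair, then slower joint diffusion with D = 0.01 after pairing.
ts = [0.1 * i for i in range(1001)]
t_pair = 10.0
msd = [6.0 * 1.0 * t if t < t_pair
       else 6.0 * 1.0 * t_pair + 6.0 * 0.01 * (t - t_pair)
       for t in ts]

d_0, d_inf = short_and_long_time_d(ts, msd, t_short=5.0, t_long=50.0)
print(d_0, d_inf)
```

With noisy simulation data the two windows should be chosen well away
from the crossover time, since a fit that straddles the crossover mixes
the two regimes.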

Discussion 

In the cell nucleus, in a striking example of self- 
organization, the architecture of a vast number of DNA 
and nuclear loci is orchestrated to form complex and 
functional patterns involving regulatory cross-talks. In 
most cases, active processes are not required for colo- 
calization [TJ [H [7] and questions arise on how DNA se- 
quences recognize their targets and establish their rel- 
ative positioning, and how the cell can control these 
processes. Via a schematic statistical mechanics model, 
here we tried to address these questions and to propose 
a first quantitative scenario of a colocalization mech- 
anism based on weak, biochemically unstable interac- 
tions between specific DNA sequences and their molecu- 
lar binders. The mere production of molecules that bind 
both DNA and target is not sufficient to produce reli- 
able and stable contacts. We showed they are activated 
only above a phase transition point, i.e. for concentra- 
tion and affinity of the molecular mediators above precise 
threshold values (e.g., molecule concentrations around 
ρ ~ 10^-3 - 10^-1 μmol/l correspond to transition energies
in the range E ~ 3 - 7 kT). Once these conditions are
met, DNA loci find their relative positions as stable thermodynamic
states at no energetic costs, as the resources
required are provided by the surrounding thermal bath
(Fig. [8]). The switch-like nature of the mechanism of target
recognition and colocalization we discussed could be
exploited in the cell to reliably induce loci colocalization.
In fact, well-known cell strategies of chromatin structure 
modification (i.e. change in E or n_0) or upregulation of
binding proteins (i.e. change in c) can produce precise, 
switch-like architectural rearrangements. Deep similari- 
ties are found across a variety of experimental data like 
those discussed in the Introduction, including specific as- 
pects such as the effects of protein concentration changes 
on DNA looping (for examples, see [22], [24]). The robust
thermodynamic essence of the process we discuss
could support the idea that passive shuttling phenom- 
ena can be traced back to simple universal mechanisms 
[25]-[28], [30], in a sense independent of the biochemical
details found in specific cases. Conversely, many com- 
plexities can arise in real cell nuclei, where a variety of 
other specific mechanisms are likely to intervene. For 
the sake of definiteness, we referred to DNA, but similar
thermodynamic mechanisms could work for other bio- 
logical polymers such as RNA, etc. Non-specific molec- 
ular factors and non-specific DNA binding can further 
assist the search kinetics [SH [53j [54] , whereas other pro- 
cesses can intervene (e.g. to stabilize binding and to ad- 
just DNA-target alignment if necessary). We also showed 
that non-specific binding sites on DNA and/or on its tar- 
get can have an important effect on colocalization, yet the 
general scenario depicted above is unchanged. Predictions
about the outcomes of, for example, genetic and/or chemical
manipulations (such as DNA deletions) can be made and
tested against experimental
data. We tried to set the system parameters (e.g., the
molecule concentration, the dynamics time scale) in a 
regime relevant to the real biological cases (see model de- 
scription). Nevertheless, our model is very schematic and 
we included only the minimal molecular ingredients (i.e., 
molecular binders and specific DNA sites) that emerge 
from the experiments. However, a simple model can
better serve the purpose of illustrating the core ingredients
necessary for DNA target recognition (which can
be traced back to polymer adsorption) and of depicting a
schematic, yet quantitative, scenario.

Figure 1: Thermodynamics of colocalization. The mean-field theory description of DNA-target colocalization. Panel
A The polymer excess colocalization probability, p, is plotted as a function of the average binding energy density E_t. The
inset shows the probability, x, to find polymer 1 in the right half of the nucleus. The plots show how, at E_t/kT = 2, a
transition occurs between a phase where polymers are independently located in space (p = 0%, x = 50%) and a phase where
they colocalize (p > 0%, x ≠ 50%). Panel B The transition surface c n_0 E^/kT = constant is depicted in the space of molecule
concentration, c, binding energy, E, and number of binding sites, n_0.

Figure 2: Kinetics of colocalization. The normalized mean square distance, d², between the DNA and the nuclear target
binding sites (BSs) is plotted as a function of time, t, for two values of their binding molecule chemical affinity, E (here c = 0.2%
and n_0 = 24). Data are from Monte Carlo (MC) simulations. d²(t) has a linear diffusive behaviour at short t (inset) and a
long-time exponential approach to an equilibrium value. The latter corresponds to colocalization only if E is above a threshold
(see text and Fig. [4]). The upper panels show system configurations from MC simulations at three times for E = 2.5kT
and provide a pictorial representation of our model: the DNA locus is modelled as a SAW polymer made of "beads" that have
an affinity equal to 0 or to E (red beads and green beads, respectively) for Brownianly diffusing molecules (yellow beads). A
cluster of binding sites is also present on the nuclear target (blue surface).

Figure 3: DNA diffusion, recognition and colocalization to a nuclear target. The mean square displacement, ⟨Δs²⟩,
of the center-of-mass of the DNA polymer is plotted as a function of time. While at E/kT = 1.6 (squares) ⟨Δs²⟩ is linear in t
at short as well as at long times, with the same diffusion constant (D_0 and D_∞, inset), for E/kT = 2.5 (circles) around t = 2
hours, ⟨Δs²⟩(t) reaches a plateau, showing that a stable contact with the target is established and diffusion is interrupted. In
the inset, the short- and long-time diffusion constants, D_0 and D_∞, are shown as a function of E (normalized by D_0(E = 0)).



[Figure 4 plots. Panels A and B: x-axis E/kT (ticks at 1.6, 2, 2.5); vertical axes with ticks from 20 to 100; regions labelled "Brownian Diffusion" and "Attachment to Target"; threshold E*/kT marked.]



Figure 4: The switch for DNA-target colocalization. Panel A The equilibrium normalized square distance, d², between
the DNA polymer and its nuclear target is shown as a function of E/kT, the binding molecule affinity. At small E, d² has
a value corresponding to random diffusion (d² = 100%); above a threshold E ~ E* = 2.1kT (blue vertical arrow), a phase
transition occurs and d² collapses to zero (blue horizontal line), indicating that DNA and target are colocalized. Molecule
chemical affinity (or concentration, see Fig. [5]) acts as a switch. Here, c = 0.2% and n_0 = 24. Panel B The attachment
probability, p, of DNA to target increases correspondingly from 0% to 100%.

Figure 5: Colocalization-state diagrams. The system phase diagram in the main panel shows the regions in the (c, E)
plane where DNA attachment to target and Brownian diffusion occur (here n_0 = 24), in the presence (green circles) or absence
(orange squares) of non-specific binding sites with a low affinity for molecular binders (E_ns = 1.5kT). In the inset, the full 3D
phase diagram in the (c, E, n_0) space is shown. The transition surface (grey) is a power-law fit (see text).

Figure 6: Non-linear effects of deletions. After deletion of a fraction, Δn_0/n_0, of DNA binding sites, the probability, p, of
DNA-target colocalization is changed. p has a non-linear behaviour with Δn_0: colocalization is only impaired by above-threshold
deletions (here E = 2.5kT, c = 0.2% and n_0 = 24).

Figure 7: Diffusion and colocalization of two DNA loci. This figure illustrates the colocalization of two DNA segments.
Panel A The mean square displacement of the center of mass of one of the DNA segments, ⟨Δs²⟩(t), is plotted as a function of
time t. At small binding energies (squares), ⟨Δs²⟩(t) has a linear diffusive behaviour at all times as no colocalization occurs. At
higher energies (circles), two different diffusive regimes are found at short and long time scales, before and after colocalization.
In the inset, the diffusion constants at short and long time scales (D_0 and D_∞) are shown as a function of E. Panel B 2D
projections of the system trajectory from a Monte Carlo simulation showing the initial Brownian diffusion of two separated
DNA loci (t = 0.5 minutes) and their colocalization (t = 5 minutes and t = 50 minutes).

Figure 8: Pictorial representation of the mechanism whereby molecular factors mediate DNA-target recognition
and colocalization. The DNA-target colocalization mechanism investigated here has a thermodynamic origin. It occurs as a
switch-like process only when the concentration and the affinity of molecular binders exceed a threshold value corresponding
to a phase transition (in a finite-sized system). Conversely, below the threshold, stable colocalization is thermodynamically
impossible and the loci diffuse independently (see phase diagram in Figs. [1] and [5]). The process has no energy costs, with
resources being provided by the thermal bath.



